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ABSTRACT 

I present an analysis for fitting cosmological parameters from a Hubble Diagram of a 
standard candle with unknown intrinsic magnitude dispersion. The dispersion is deter- 
mined from the data themselves, simultaneously with the cosmological parameters. This 
contrasts with the strategies used to date. The advantages of the presented analysis are 
that it is done in a single fit (it is not iterative), it provides a statistically founded and 
unbiased estimate of the intrinsic dispersion, and its cosmological-parameter uncertain- 
ties account for the intrinsic dispersion uncertainty. Applied to Type la supernovae, my 
strategy provides a statistical measure to test for sub-types and assess the significance 
of any magnitude corrections applied to the calibrated candle. Parameter bias and dif- 
ferences between likelihood distributions produced by the presented and currently-used 
fitters are negligibly small for existing and projected supernova data sets. 

Subject headings: Supernovae: Data Analysis and Techniques 



1. Introduction 



The homogeneous nature of Type la supernovae (SNe la) makes them a popular tool for 
measuring cosmological distances. After empirical corrections based on light curve shape, color, and 
spectral features, the absolute magnitude (or distance modulus) of a supernova can be determined to 
0.12 mag dGuy et al.|2007||Jha et al.|2007t|Conley et al.|2008l|Bailey et al.|2009D . SNe la have been 



used to successfully measure the expansion rate of the universe (the Hubble Constant; Freedman 
et al.|200l Riess et al.|2009 ), discover its accelerated expansion ( Riess et al.|1998 Perlmutter et al 



1999), and measure the properties of the dark energy responsible for that acceleration (Hicken et al 



2009 Amanuhah et al. 2010) 



The small scatter in the peak brightness of SN la luminosities is inferred from the small 



residuals in their Hubble Diagrams (Kowal 1968); the intrinsic supernova magnitude dispersion is 



measured from differences between observed magnitudes and those predicted by the cosmological 
model, e.g. the linear Hubble law for low redshift. Although there are theoretical explanations for 
this dispersion including intrinsic progenitor properties, circumstellar dust, and viewing angle (see 
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e.g. Kasen|2006 Goobar|2008 Wang et al.|2003 ), in practice the amount of dispersion is determined 
empirically from the data themselves. 

The luminosity dispersions of supernova subsets are statistics that can be used to compare 
and identify SN la subclasses. The prevailing belief is that the intrinsic luminosity of an individual 
supernova, including line-of-sight effects, is encoded non-trivially within a finite set of physical and 
geometric parameters. The "intrinsic" dispersion arises from our lack of observational access to all 
those parameters and incomplete knowledge of how to exploit those that are available. It is possible 
that SN la subclasses with different average luminosities are responsible for some of the intrinsic 
dispersion seen in current data. Correlations between supernova light curves and spectral features 



(Benetti et al. 2005; Bailey et al. 2009, Wang et al. 2009; Foley & Kasen 2010) and host galaxy 



(Benetti et al. 


2005 


(Sullivan et al. 


2010 



(Sullivan et al. 2010; Lampeitl et al. 2010) give evidence that SNe la need to be modeled in finer 
detail using an expanded suite of data. Likelihood surfaces of intrinsic dispersion for supernova 
subsets provide a statistical measure to test whether data are best described by a single intrinsic 
dispersion. 

This paper presents the methodology for simultaneously fitting for the intrinsic dispersion and 
the cosmological parameters that specify the dynamics of the cosmic expansion. Although I present 
straightforward textbook likelihood analysis, it has yet to be applied on supernova-cosmology data. 



My approach contrasts with that of Shafieloo et al. (2010), who suggest using Monte Carlo analysis 
of statistics that are insensitive to the intrinsic dispersion. This paper is organized as follows: 
^ presents the likelihood equation and contrasts it with the commonly used method. Results of 
simulations are given in ^ that show the quantitative differences between the results of the two 
analyses. I finish with conclusions in Q 



2. The Likelihood 

Given a set of measured quantities fii with covariance C at known points Zi and a model 
F{zi; 6) for the corresponding true values parameterized by 9, the Gaussian likelihood L can be 
expressed as 

£ = -21nL 

= Indet C + (/I - F(z; 9)fC-\fx - F(z; 6)) (1) 

neglecting the irrelevant 2tt term. For a supernova-cosmology analysis fii and Zi correspond to the 
estimators for the distance modulus and redshift of supernova i. The function F is the theoretical 
prediction for the distance modulus as a function of redshift and a set of cosmological parameters, 
e.g. Qm, ^de, and w. 

The covariance matrix gives the deviation of all possible measurements for all possible super- 
novae from the mean distance moduli at the given redshifts; C not only has a contribution from 
measurement uncertainty in supernova magnitudes. Cm, but also from the fact that supernovae 
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are drawn from a population with intrinsic magnitude dispersion. Assuming that all supernovae 
are independently drawn from the same luminosity function with unknown dispersion, the total 
covariance matrix is C = Cm + C/I making C a function of the model parameter aj. I take the 
intrinsic scatter to be parameterized by the variance, not the standard deviation. 

The determination of the best-fit and confidence regions for the parameters follows the standard 
procedure of minimizing and mapping isocontours on the surface of Eqn. [l} 

The prevalent supernova-cosmology analysis proceeds differently. The intrinsic supernova vari- 
ance is not treated as a fit parameter: the first term in the log-likelihood in Eqn. [T] is ignored, some 
initial guess of aj is included in C, and the (the second term in Equation [l| is minimized to get 
the best-fit parameters. Then holding those parameters fixed, the value of aj that gives \^ /dof = 1 
is determined. This process is repeated until the fits converge to stationary values. Alternatively, 
this process is applied to a low-redshift subsample from which a (t| is measured as the dispersion 
from the linear Hubble law, and is inserted in the data covariance matrix of the full sample. The 
closeness of the resulting x^/dof to unity checks the consistency between the dispersion of the 
training and full sets. 



3. Simulation 

I simulate experiments specified by the number of supernovae they produce, either = 50 or 
1000 uniformly distributed from 0.08 < z < 0.8, and the distance modulus measurement uncertainty 
per supernova Us, either 0.05, 0.1, 0.2, or 0.02 -|- 0.025z mag. The data are supplemented with an 
additional 100 SNe aX z = 0.05 each with a measurement uncertainty of 0.02 mag. The measurement 
covariance is Cm,ij = ^ijO'l- The supernovae have an intrinsic dispersion of ai = 0.1 mag. The 
set of experimental realizations for each case is generated with the same random-number generator 
seed. All experiments occur in a fiat ACDM universe with = 0.27 and w = —1. 

The data from each realized experiment are analyzed in two ways. First, the data are fit using 
the full Equation [T] to a model with a fiat-universe dark-energy cosmology with constant equation 
of state parameterized by I^a/ and w, and an intrinsic supernova dispersion (t|. This is referred to 
as the InL fit. Second, the data are initially fit to the cosmological model but holding o"? = fixed. 
Then, holding the best-fit cosmological parameters fixed, the value of af that gives /dof = 1 is 
calculated. This process is repeated twice more starting with the updated values of o"|; I find that 
the fit results converge after three iterations. These are referred to as the fits. 

For each type of experiment, I generate an ensemble of realizations each analyzed using the In L 
and iterated fits. For the A^ = 50 experiments I generate 5000 realizations and for the A^ = 1000 
experiments, 1000 realizations. The fitting is performed with the MIGRAD minimization of the 



Minuit (James &: Roos 1975) implementation in ROOT (Antcheva et al. 2009). The parameter 
confidence intervals are taken directly from the extrema of the Cmin + 1 contours (using the MINOS 
function call); the contours can be asymmetric around the extrema so my quoted uncertainties are 
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half the interval length. 

In terms of the fit, the cosmology model is pathological when w = and the dark energy 
is dynamically indistinguishable from non-relativisitic matter. The minimization can fail if the 
maximum likelihood approaches w = and the parameter-uncertainty determination may fail 
when w = falls within the accepted confidence region. Given that the input cosmology has 
w = —1, the fitter encounters this condition with regularity only when the data quality is poor. 
This is seen in Figure[l| which shows the histogram of the best-fit w from the In L fit of the N = 50, 
<T/ = 0.2 run. The open curve represents fits that succeeded in getting the asymmetric uncertainties 
in i^M, the shaded curve represents those that failed. The other runs with more SNe and/or lower 
measurement uncertainty produce significantly fewer or no such failures. In my analysis I include 
only those realizations with successful uncertainty determination; the exclusion of the failed fits is 
not expected to bias distributions of parameter uncertainties nor the determination of aj. 

The fits converge to stable values by the third iteration. For example in the A'" = 50, 
as = 0.2 run, the distribution of the shift in aj between the second and third iterations has a mean 
of 5.8 X 10~^ and an RMS of 1.8 x 10^'', both small compared to the input aj = 0.01. 

For each N~as pair I calculate the averages of the cosmological-parameter uncertainties and 
the best-fit and uncertainties for the intrinsic-dispersion parameter. To directly compare the two 
fitters, I also calculate the mean and RMS of the difference in the uncertainties they return. The 
results are given in Table [T} 

The distributions are skewed by amounts that depend on the statistic and the experimental 
configuration. It is well known that parameter confidence regions are not elliptical and that the 
size of the region depends on where the best fit falls ( Weller fc Albrecht||2002 ). In addition, as seen 



in Figure [T| the fitter fails preferentially in the tail of the distribution where u) = is favored. The 
asymmetries are therefore accentuated when parameter uncertainties are large; among the cases 
considered in this paper the i^M fits of the = 50 runs and the w fit of the iV = 50, = 0.2 run 
are particularly affected. In these extreme cases, the quoted averages should be interpreted with 
care. 

Both the InL and fits return asymmetric af distributions. The asymmetry is more pro- 
nounced when the InL fits have higher crlaj) and the corresponding X'^'fit distributions are even 
more skewed. For larger N and/or as as decreases the averages of the InL- fit distribution approach 
the input intrinsic dispersion. The averages of the x^"fit distribution also approach the input as 
as decreases but the bias remains when going from = 50 to = 1000. To illustrate. Figure [2] 
plots for the A^ = 50, as = 0.2 run histograms of aj from both fits. The two distributions are 
different and while both are asymmetric that of the the x^"fit has a broader tail. Table [l] gives 
(^/ini) ~ 0.00985 and (o'/^2) = 0.01009. Figure 3 shows the corresponding histograms for the 
A^ = 1000, as = 0.2 run, and a histogram of the differences in the intrinsic dispersions from the 



two fits, (T? 2 — crj , T. Here the asymmetry is subtle and is more clearly seen in the differences. 



and (o-|ij^i) = 0.00998 approaches the input 0.001 whereas {aj 2) = 0.01011 remains offset. 
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ln(L) 



Fig. 1. — The histogram of the best-fit w from the InL fit of the = 50, aj = 0.2 run, the one 
with largest parameter uncertainties among the cases considered in this study. The open curve 
includes fits that succeed in determining the uncertainties in ^Im, the shaded curve includes those 
that fail. The fitter often fails when the solution converges toward w = 0. 



- 6 - 



Table 1: The averages of the cosmological-parameter uncertainties and the best-fit and uncertainties 
for the intrinsic-dispersion parameter for all the simulated data sets. Also tabulated are the average 
and RMS of the parameter uncertainties subtracted by those of the logL fits (denoted by A). The 
data are analyzed either with the In det C term in Eqn. [l] (denoted by In L) or without and holding 
aj fixed . For the latter case, results are given for one or three iterations (denoted by Xi ^^id xi 
respectively) . 



N 




Fit 


(^{!^A/)> 




RMS{Aa{QM)) 


{a{w)) 


{Aa{w)) 


RMS{Aa{w)) 










InL 


0.191 






0.311 






0.00990 


0.00126 




0.05 


xi 


0.080 


-0.11794 


0.1506 


0.091 


-0.22015 


0.0337 


0.01008 








xi 


0.205 


0.00101 


0.0019 


0.312 


0.00129 


0.0018 


0.01003 








InL 


0.243 






0.360 






0.00992 


0.00137 




0.1 


X? 


0.127 


-0.11839 


0.2486 


0.120 


-0.23983 


0.0357 


0.01011 




50 




xi 


0.260 


0.00121 


0.0094 


0.360 


0.00101 


0.0050 


0.01005 






InL 


0.267 






0.458 






0.00985 


0.00144 




0.2 


X? 


0.179 


-0.14510 


0.2027 


0.186 


-0.26416 


0.0366 


0.01012 








xi 


0.332 


0.00264 


0.0386 


0.452 


0.00105 


0.0118 


0.01009 








InL 


0.266 






0.396 






0.00988 


0.00137 




slope 


X? 


0.157 


-0.12438 


0.2123 


0.162 


-0.23292 


0.0397 


0.01012 








xi 


0.283 


0.00107 


0.0145 


0.395 


0.00095 


0.0062 


0.01006 








InL 


0.040 






0.087 






0.01000 


0.00052 




0.05 


X? 


0.017 


-0.02280 


0.0035 


0.037 


-0.04995 


0.0032 


0.01003 








xi 


0.040 


0.00003 


0.0000 


0.087 


0.00006 


0.0001 


0.01002 








InL 


0.050 






0.109 






0.00999 


0.00076 




0.1 


X? 


0.031 


-0.01944 


0.0095 


0.062 


-0.04632 


0.0059 


0.01008 




1000 




xi 


0.050 


0.00004 


0.0003 


0.109 


0.00009 


0.0005 


0.01003 






InL 


0.081 






0.167 






0.00998 


0.00123 




0.2 


xi 


0.074 


-0.00670 


0.3741 


0.088 


-0.07885 


0.0114 


0.01023 








xi 


0.081 


0.00008 


0.0013 


0.167 


0.00016 


0.0024 


0.01011 








InL 


0.057 






0.116 






0.01000 


0.00076 




slope 


xi 


0.035 


-0.02110 


0.0096 


0.065 


-0.05052 


0.0063 


0.01007 








xi 


0.057 


0.00003 


0.0005 


0.116 


0.00007 


0.0011 


0.01004 
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Fig. 2. — The histograms of aj as determined by the InL (sohd) and (dashed) fits for = 50 
and as = 0.2. The distributions are sUghtly asymmetric with broader tails at larger values. The 
In L fits fail more frequently than fits, for direct comparison both histograms include realizations 
that succeed in both fits. 
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0.016 



Mean 0.000132 
RMS 0.001008 




Fig. 3. — The top plot shows histograms of aj as determined by the InL (sohd) and (dashed) 
fits for N = 1000 and (jg = 0.2. Unhke the = 50, ag = 0.2 case, both fitting methods succeed 
for all realizations. The bottom plot shows the histogram of their difference <7^^2 — '^finL ^ith the 
best-fit Gaussian overplotted. Note that the distribution is slightly asymmetric with all the points 
on the right-side tail falling well above the Gaussian curve. 
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Though not directly appHcable to the cases simulated for this paper, I point out that when 
all supernovae have the same measurement uncertainty the distribution of intrinsic dispersions 
that give /dof = 1 is known trivially. An experiment with a realized x|j for an input intrinsic 

,2 
10 



dispersion cr?Q has an inferred intrinsic dispersion 



2 _ K+^io) 2 2 

dof 

The (t| distribution thus corresponds directly with the distribution. This does not apply to the 
experiments simulated in this paper that have a low-redshift set of supernovae with measurement 
uncertainty that differs from those of the high-redshift set. 

The best-fit cosmological parameters differ for the two fitters only when they deduce different 
intrinsic dispersions. As seen in Table [l] and the example shown in Figure [sj the ct|'s returned by 
the two fits agree with little bias within expected measurement uncertainties. I confirm that the fits 
also find similar optimal and w. Figure |4] shows histograms for ^m,x'^ — ^M.inL and w^2 — wx-^l 
from the representative N = 1000 and Os = 0.2 run. The both are highly peaked around zero with 
ranges much smaller than the statistical measurement uncertainty. 

In an individual realization of an experiment the covariance between the intrinsic dispersion 
and the cosmological parameters in the InL fits can be non-zero. This is illustrated by a typical 
realization of a = 1000 and ag = 0.1 experiment; the correlation coefficients between (t| and 
and w are —0.018 and 0.019 respectively. The corresponding 1- and 2-a confidence regions in Jl-o"! 
and w-ali space are shown in Figure [S] The likelihood is pronouncedly asymmetric for Q^m and w, 
whereas it is close to symmetric in it|. The fits do not provide an uncertainty for the intrinsic 
dispersion nor their propagated effect on the other parameters. 

The average parameter uncertainties for a given run must differ between fitters. The minimum 
of C in the InL fit is less than (or equal to) the minimum so >C™^ + 1 < C^2^ + 1, the conditions 
that define the 1-a contours. Also, the extra cr| dimension in the InL fit opens room for a broader 
range of acceptable cosmological-parameter values to be contained within the confidence region. 

Except for the $7 a/ uncertainties in the A^ = 50 runs, both fits give similar average uncertainties 
in the cosmological parameters. On a per-realization level, the average and RMS of the difference 
between the x^ and InL fits ((Acr(f]Af)), RMS{l\a{VLM)), {Aa{w)), and RMS{Aa{w))in Table[l| 
are small compared to the uncertainties themselves. The distributions of the difference in parameter 
uncertainties between the third x^"fit iteration and the InL fit (7{wa) are shown in Figure [g] for the 
case of A^ = 1000 and as = 0.02. Although they both are close to Gaussian, there is a slight excess 
in the high end (corresponding to larger x^- or smaller InL-fit uncertainties) just as is the case in 
the distribution of differences shown in Figure [sj 



Amanullah et al. (2010) have shown that supernova samples from different observatories exhibit 
different intrinsic magnitude dispersions. I run the A^ = 1000, ag = 0.1 case fitting for two intrinsic 
dispersion parameters, one for the nearby z = 0.05 set and another for the higher-redshift set. 
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Fig. 4. — Histograms of the difference in tlie best-fit parameters returned by the ^ and In L fitters, 
^M,x^ ~ ^A/,lnL (top) and — WinL (bottom), for the = 1000 and fj^ = 0.02 experiment. 
Overplotted on each are the Gaussian best-fits to the data. 
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Fig. 5. — The r^Af-crl (top) and w-aj (bottom) 1- and 2-a confidence regions for one realization 
of a = 1000 and ag = 0.1 experiment. The parameters have small 0.019 correlation. 
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Fig. 6. — Histograms of the difference in the parameter micertainties returned by the In L 

fitters, cr(r2Af)^2 — o-{QM)inL (top) and a{w)^2 —cr{w)inL (bottom), for the N = 1000 and as = 0.02 
experiment. Overplotted on each are the Gaussian best-fits to the data. 
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The fitters return averages for the intrinsic dispersion uncertainties, (a^af)), of 0.00148 for low 
redshift and 0.00090 for high redshift. These numbers provide a quantitative measure of how well 
possible systematic differences between the two populations could be resolved. For comparison, 
(cj((t|)) = 0.00076 when a single aj is fit for all supernovae. 



4. Conclusions 

1 have shown how to fit for cosmological parameters with SNe la when the intrinsic dispersion of 
the standard candle is unknown. My standard likelihood function has not been used in cosmological 
analysis to date. 1 show via simulation that, on average, our likelihood function is maximal at the 
values of the input parameters including the intrinsic dispersion. The presented and previously 
used iterative fitting methods do not give biases in the best-fit cosmological parameters and any 
differences in a single experiment are due to realization scatter. The fitter methods do return 
different intrinsic dispersions and parameter uncertainties. The procedure presented here has the 
advantage that it includes the covariance of the intrinsic dispersion with the other parameters in 
its error propagation, and the fit is done in a single iteration. 

The methodology can be extended to cases where multiple dispersion parameters are fit. 1 
present an example taking the low- and high-redshift sets as being drawn from different magni- 
tude distributions. The same approach can be used to check whether different supernova subsets 
(tagged for example by redshift, host-galaxy characteristics or spectral features) exhibit statistically 
significant differences in their population characteristics. 

The approach is appropriate for any analysis that uses a statistic for which the tracer has an 
intrinsic dispersion that must be determined from the data. For example, in weak gravitational 
lensing the measurement of correlated shear is obscured by the unknown intrinsic shape of individual 
galaxies. The intrinsic dispersion in galaxy ellipticities can be made a fit parameter determined 
simultaneously with those of cosmological interest. 

Inclusion of the likelihood-function normalization when fitting is not new to astronomy nor 



cosmology; Wheaton et al. ( 1995 ) showed its importance in shot-noise-dominated photometry and 



it is retained in other cosmological analyses (see e.g. Bridle et al. 2002; Taylor Sz Kitching 2010). 



Holsclaw et al. (2010) do include a fit parameter in the data covariance for their supernova analysis 



although there it serves as a hyperparameter of the Gaussian-process prior on w{z). [Kessler et al. 



(2010) include the normalization term; though containing no fit parameters it is needed to directly 
compare the x^'s derived from different light-curve models. 

This paper gives a simplified view of how the standard candle nature of SNe la is used in 
cosmology analysis. SNe la are in fact calibrated candles; independent observables (light-curve 
shape, colors, spectral features) are correlated with peak absolute magnitude to correct and lower 
the dispersion in distance determinations. I advocate that intrinsic dispersion be measured as a fit 
parameter from the data simultaneously with the magnitude-correction and cosmological parame- 
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ters. This provides a new perspective in how we search for magnitude corrections that make SNe 

la better calibrated candles. In the past we have sought parameterized magnitude corrections that 
minimize distance dispersion; we can now seek corrections and their inferred intrinsic dispersions 
that are most consistent with observations and are statistically favored over having no correction. 
Application of this technique to real SN data sets is the subject of ongoing work. 

I acknowledge fruitful discussions with Eric Linder, David Rubin, and Ramon Miquel. This 
work was supported by the Director, Office of Science, Office of High Energy Physics, of the U.S. 
Department of Energy under Contract No. DE-AC02-05CH11231. 
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